Pivot Alignment

Author

  • Lars Borin

Abstract

Word alignment of parallel texts is typically carried out using many kinds of knowledge, or information sources, in concert; that is, it is profitably viewed as a cooperative process in which distribution, string similarity, co-occurrence statistics, and other information sources are used together. In this paper we investigate a novel information source of this kind, namely the use of a third language as a ‘pivot’ to increase alignment recall, hence the name pivot alignment. The results of the preliminary experiments reported here indicate that pivot alignment increases word alignment recall without sacrificing precision. We conclude that the method is well worth exploring further by examining more languages and language combinations.
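The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading of the pivot idea, not the author's implementation. It assumes that word links are available for source–pivot and pivot–target pairs and that a source word and a target word linked to the same pivot word may be linked to each other; these inferred links are then added to the direct source–target links. The function names (`compose_via_pivot`, `pivot_align`) and the toy word pairs are hypothetical and invented for illustration.

```python
from typing import Dict, Set, Tuple

# A link is a (word_in_language_A, word_in_language_B) pair.
Link = Tuple[str, str]


def compose_via_pivot(src_pivot: Set[Link], pivot_tgt: Set[Link]) -> Set[Link]:
    """Infer source-target links: link s to t when some pivot word is linked to both.

    Assumption: simple relational composition; a real system would likely
    filter or score these candidates.
    """
    targets_by_pivot: Dict[str, Set[str]] = {}
    for pivot_word, tgt_word in pivot_tgt:
        targets_by_pivot.setdefault(pivot_word, set()).add(tgt_word)
    return {
        (src_word, tgt_word)
        for src_word, pivot_word in src_pivot
        for tgt_word in targets_by_pivot.get(pivot_word, ())
    }


def pivot_align(direct: Set[Link], src_pivot: Set[Link], pivot_tgt: Set[Link]) -> Set[Link]:
    """Add pivot-inferred links to the direct links (union), aiming at higher recall."""
    return direct | compose_via_pivot(src_pivot, pivot_tgt)


# Toy, made-up data: the direct aligner found only one link.
direct_links = {("hus", "house")}
source_pivot_links = {("hus", "Haus"), ("bil", "Auto")}
pivot_target_links = {("Haus", "house"), ("Auto", "car")}

combined = pivot_align(direct_links, source_pivot_links, pivot_target_links)
print(sorted(combined))
```

In this toy case the pivot route contributes a link that the direct alignment missed, which is the recall effect the abstract describes; whether precision is preserved depends on the quality of the two pivot alignments.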

Similar resources

Joint Alignment and Artificial Data Generation: An Empirical Study of Pivot-based Machine Transliteration

In this paper, we first carry out an investigation on two existing pivot strategies for statistical machine transliteration, namely system-based and model-based strategies, to figure out the reason why the previous model-based strategy performs much worse than the system-based one. We then propose a joint alignment algorithm to optimize transliteration alignments jointly across source, pivot an...

You'll Take the High Road and I'll Take the Low Road: Using a Third Language to Improve Bilingual Word Alignment

While language-independent sentence alignment programs typically achieve a recall in the 90 percent range, the same cannot be said about word alignment systems, where normal recall figures tend to fall somewhere between 20 and 40 percent, in the language-independent case. As words (and phrases) for various reasons are more interesting to align than sentences, we need methods to increase word al...

Attempting to Bypass Alignment from Comparable Corpora via Pivot Language

Alignment from comparable corpora usually involves two languages, one source and one target language. Previous work on bilingual lexicon extraction from parallel corpora demonstrated that more than two languages can be useful to improve the alignments. Our work has investigated to what extent a third language could be used to bypass the original alignment. We have defined two origina...

Fast Multiple Alignment of Protein Structures Using Conformational Letter Blocks

Most approaches to protein structure alignment start from a search for similar fragments, since this local similarity is necessary for the alignment even though it is insufficient. In contrast to sequence alignment, any insignificant trial alignment for structures can be detected by structure superposition and then excluded. It is then practicable to select from locally similar fragments those ...

Multi-Task Word Alignment Triangulation for Low-Resource Languages

We present a multi-task learning approach that jointly trains three word alignment models over disjoint bitexts of three languages: source, target and pivot. Our approach builds upon model triangulation, following Wang et al., which approximates a source-target model by combining source-pivot and pivot-target models. We develop a MAP-EM algorithm that uses triangulation as a prior, and show how...

Publication date: 1999